Boosting First-Order Clauses for Large, Skewed Data Sets
Abstract
Creating an effective ensemble of clauses for large, skewed data sets requires finding a diverse, high-scoring set of clauses and then combining them in such a way as to maximize predictive performance. We have adapted the RankBoost algorithm to maximize area under the recall-precision curve, a much better metric than ROC curves when working with highly skewed data sets. We have also explored a range of possibilities for the weak hypotheses used by our modified RankBoost algorithm beyond using individual clauses. We provide results on four large, skewed data sets showing that our modified RankBoost algorithm outperforms the original on area under the recall-precision curve.
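The contrast between the two metrics on skewed data can be illustrated with a minimal sketch. It is not the authors' code: the function names, the toy data, and the use of average precision as a step-wise estimate of the area under the recall-precision curve are all assumptions made for illustration.

```python
def roc_auc(scores, labels):
    """Fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    correct = sum(1.0 if p > n else 0.5 if p == n else 0.0
                  for p in pos for n in neg)
    return correct / (len(pos) * len(neg))


def average_precision(scores, labels):
    """Mean of the precision measured at the rank of each positive example."""
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits, total = 0, 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank
    return total / hits


# Hypothetical skewed toy set: 10 positives among 1000 examples.
# 50 negatives outrank every positive; the other 940 negatives score low.
scores = [0.9] * 50 + [0.8] * 10 + [0.1] * 940
labels = [0] * 50 + [1] * 10 + [0] * 940

print("ROC AUC:", round(roc_auc(scores, labels), 3))
print("Average precision:", round(average_precision(scores, labels), 3))
```

On this toy ranking the ROC AUC is about 0.95 while the average precision stays below 0.1: the 50 high-scoring negatives barely affect the ROC figure because they are a small fraction of all negatives, yet they dominate the top of the ranking that precision measures.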
Similar resources
Boosting interval based literals
A supervised classification method for time series, even multivariate, is presented. It is based on boosting very simple classifiers: clauses with one literal in the body. The background predicates are based on temporal intervals. Two types of predicates are used: i) relative predicates, such as “increases” and “stays”, and ii) region predicates, such as “always” and “sometime”, which operate o...
Rewrite-Based Equational Theorem Proving with Selection and Simplification
We present various refutationally complete calculi for first-order clauses with equality that allow for arbitrary selection of negative atoms in clauses. Refutation completeness is established via the use of well-founded orderings on clauses for defining a Herbrand model for a consistent set of clauses. We also formulate an abstract notion of redundancy and show that the deletion of redundant c...
Time Series Classification by Boosting Interval Based Literals
A supervised classification method for temporal series, even multivariate, is presented. It is based on boosting very simple classifiers: clauses with one literal in the body. The background predicates are based on temporal intervals. Two types of predicates are used: i) relative predicates, such as “increases” and “stays”, and ii) region predicates, such as “always” and “sometime”, which opera...
Asociación Española Para La Inteligencia Artificial España Time Series Classification by Boosting Interval Based Literals
A supervised classification method for temporal series, even multivariate, is presented. It is based on boosting very simple classifiers: clauses with one literal in the body. The background predicates are based on temporal intervals. Two types of predicates are used: i) relative predicates, such as “increases” and “stays”, and ii) region predicates, such as “always” and “sometime”, which opera...
Boosting Descriptive ILP for Predictive Learning
Inductive Logic Programming has been very successful in application to multirelational predictive tasks. Sophisticated predictive ILP systems, such as Progol and foil, can achieve high predictive accuracy, while the learning results remain understandable. Although boosting [1] is an established method to promote predictive accuracy of weak algorithms, there have been relatively few efforts to a...
Publication date: 2009